Abstract This paper derives new algorithms for signomial programming, a generalization
of geometric programming. The algorithms are based on a generic principle for
optimization called the MM algorithm. In this setting, one can apply the
geometric-arithmetic mean inequality and a supporting hyperplane inequality to create a
surrogate function with parameters separated. Thus, unconstrained signomial programming
reduces to a sequence of one-dimensional minimization problems. Simple examples
demonstrate that the MM algorithm derived can converge to a boundary point or to
one point of a continuum of minimum points. Conditions under which the minimum
point is unique or occurs in the interior of parameter space are proved for geometric
programming. Convergence to an interior point occurs at a linear rate. Finally, the
MM framework easily accommodates equality and inequality constraints of signomial
type. For the most important special case, constrained quadratic programming, the
MM algorithm involves very simple updates.

Keywords arithmetic-geometric mean inequality • global convergence • MM
algorithm • parameter separation • penalty method
1 Introduction

As a branch of convex optimization theory, geometric programming is next in line to
linear and quadratic programming in importance [3,4,14,15]. It has applications in
chemical equilibrium problems [13], structural mechanics J], integrated circuit design
[6], maximum likelihood estimation [11], stochastic processes [5], and a host of other
subjects [I]. Geometric programming deals with posynomials, which are functions of
the form

$$f(x) = \sum_{\alpha \in S} c_\alpha \prod_{i=1}^n x_i^{\alpha_i}. \tag{1}$$

Here the index set $S \subset \mathbb{R}^n$ is finite, and all coefficients $c_\alpha$ and all
components $x_1, \ldots, x_n$ of the argument $x$ of $f(x)$ are positive. The possibly
fractional powers $\alpha_i$ corresponding to a particular $\alpha$ may be positive, negative,
or zero. For instance, %± + 2x\x^ is a posynomial on K. In geometric programming we
minimize a posynomial $f(x)$ subject to posynomial inequality constraints of the form
$u_j(x) \le 1$ for $1 \le j \le q$, where the $u_j(x)$ are again posynomials. In some versions of
geometric programming, equality constraints of posynomial type are permitted [2].

A signomial function has the same form as the posynomial (1), but the coefficients
$c_\alpha$ are allowed to be negative. A signomial program is a generalization of a geometric
program, where the objective and constraint functions can be signomials. From a
computational point of view, signomial programming problems are significantly harder
to solve than geometric programming problems. After a suitable change of variables, a
geometric program can be transformed into a convex optimization problem and globally
solved by standard methods. In contrast, signomials may have many local minima.
Wang et al. [19] recently derived a path algorithm for solving unconstrained signomial
programs.

The theory and practice of geometric programming has been stable for a genera- 
tion, so it is hard to imagine saying anything novel about either. The attractions of 
geometric programming include its beautiful duality theory and its connections with 
the arithmetic-geometric mean inequality. The present paper derives new algorithms 
for both geometric and signomial programming based on a generic device for iterative 
optimization called the MM algorithm [8,10]. The MM perspective possesses several
advantages. First, it provides a unified framework for solving both geometric and signomial
programs. The algorithms derived here operate by separating parameters and reducing 
minimization of the objective function to a sequence of one-dimensional minimization 
problems. Separation of parameters is apt to be an advantage in high-dimensional prob- 
lems. Another advantage is ease of implementation compared to competing methods 
of unconstrained geometric and signomial programming [15]. Finally, straightforward
generalizations of our MM algorithms extend beyond signomial programming. 

We conclude this introduction by sketching a roadmap to the rest of the paper.
Section 2 reviews the MM algorithm. Section 3 derives an MM algorithm for unconstrained
signomial programs from two simple inequalities. The behavior of the MM algorithm is
illustrated on a few numerical examples in Section 4. Section 5 extends the MM
algorithm for unconstrained problems to the constrained cases using the penalty method.
Section 6 specializes to linearly constrained quadratic programming on the positive
orthant. Convergence results are discussed in Section 7.



2 Background on the MM Algorithm 

The MM principle involves majorizing the objective function $f(x)$ by a surrogate
function $g(x \mid x_m)$ around the current iterate $x_m$ (with $i$th component $x_{mi}$).
Majorization is defined by the two conditions

$$\begin{aligned} f(x_m) &= g(x_m \mid x_m) \\ f(x) &\le g(x \mid x_m), \quad x \ne x_m. \end{aligned} \tag{2}$$

In other words, the surface $x \mapsto g(x \mid x_m)$ lies above the surface $x \mapsto f(x)$ and is
tangent to it at the point $x = x_m$. Construction of the majorizing function $g(x \mid x_m)$
constitutes the first M of the MM algorithm.

The second M of the algorithm minimizes the surrogate $g(x \mid x_m)$ rather than
$f(x)$. If $x_{m+1}$ denotes the minimizer of $g(x \mid x_m)$, then this action forces the descent
property $f(x_{m+1}) \le f(x_m)$. This fact follows from the inequalities

$$f(x_{m+1}) \le g(x_{m+1} \mid x_m) \le g(x_m \mid x_m) = f(x_m),$$

reflecting the definition of $x_{m+1}$ and the tangency conditions (2). The descent property
lends the MM algorithm remarkable numerical stability. Strictly speaking, it depends
only on decreasing $g(x \mid x_m)$, not on minimizing $g(x \mid x_m)$.



3 Unconstrained Signomial Programming 

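The art in devising an MM algorithm revolves around intelligent choice of the
majorizing function. For signomial programming problems, fortunately one can invoke
two simple inequalities. For terms with positive coefficients $c_\alpha$, we use the
arithmetic-geometric mean inequality

$$\prod_{i=1}^n z_i^{\alpha_i} \le \sum_{i=1}^n \frac{\alpha_i}{\|\alpha\|_1} z_i^{\|\alpha\|_1} \tag{3}$$

for nonnegative numbers $z_i$ and $\alpha_i$ and $\ell_1$ norm $\|\alpha\|_1 = \sum_{i=1}^n |\alpha_i|$ [18]. If we make
the choice $z_i = x_i / x_{mi}$ in inequality (3), then the majorization

$$\prod_{i=1}^n x_i^{\alpha_i} \le \Big( \prod_{i=1}^n x_{mi}^{\alpha_i} \Big) \sum_{i=1}^n \frac{\alpha_i}{\|\alpha\|_1} \Big( \frac{x_i}{x_{mi}} \Big)^{\|\alpha\|_1} \tag{4}$$

emerges, with equality when $x = x_m$. We can broaden the scope of the majorization
(4) to cases with $\alpha_i < 0$ by replacing $z_i$ by the reciprocal ratio $x_{mi}/x_i$ whenever $\alpha_i < 0$.
Thus, for terms $c_\alpha \prod_{i=1}^n x_i^{\alpha_i}$ with $c_\alpha > 0$, we have the majorization

$$c_\alpha \prod_{i=1}^n x_i^{\alpha_i} \le c_\alpha \Big( \prod_{j=1}^n x_{mj}^{\alpha_j} \Big) \sum_{i=1}^n \frac{|\alpha_i|}{\|\alpha\|_1} \Big( \frac{x_i}{x_{mi}} \Big)^{\|\alpha\|_1 \mathrm{sgn}(\alpha_i)},$$

where $\mathrm{sgn}(\alpha_i)$ is the sign function.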
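The majorization for positive-coefficient terms can be spot-checked numerically. The following is a minimal sketch, not part of the original derivation: the helper names `monomial` and `amgm_majorizer`, the exponent vector, and the anchor point are all illustrative assumptions. It evaluates both sides of the inequality at random positive points and at the anchor $x_m$, where equality should hold.

```python
import math
import random

def monomial(x, alpha):
    """Evaluate prod_i x_i ** alpha_i for a positive vector x."""
    return math.prod(xi ** ai for xi, ai in zip(x, alpha))

def amgm_majorizer(x, xm, alpha):
    """Separated upper bound on monomial(x, alpha) anchored at xm (unit coefficient)."""
    norm1 = sum(abs(a) for a in alpha)
    if norm1 == 0:
        return 1.0  # the monomial is the constant 1
    total = 0.0
    for xi, xmi, ai in zip(x, xm, alpha):
        sign = (ai > 0) - (ai < 0)  # sgn(alpha_i)
        total += abs(ai) / norm1 * (xi / xmi) ** (norm1 * sign)
    return monomial(xm, alpha) * total

# Hypothetical test data: mixed positive and negative fractional exponents.
random.seed(0)
alpha = (-1.0, -2.0, 0.5)
xm = (1.3, 0.7, 2.0)
samples = [[random.uniform(0.1, 5.0) for _ in alpha] for _ in range(1000)]
worst_gap = min(amgm_majorizer(x, xm, alpha) - monomial(x, alpha) for x in samples)
tangent_gap = amgm_majorizer(xm, xm, alpha) - monomial(xm, alpha)
```

A nonnegative `worst_gap` and a vanishing `tangent_gap` are what the two majorization conditions (domination and tangency) predict.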

The terms $c_\alpha \prod_{i=1}^n x_i^{\alpha_i}$ with $c_\alpha < 0$ are handled by a different majorization. Our
point of departure is the supporting hyperplane minorization

$$z \ge 1 + \ln z$$

at the point $z = 1$. If we let $z = \prod_{i=1}^n (x_i / x_{mi})^{\alpha_i}$, then it follows that

$$\prod_{i=1}^n x_i^{\alpha_i} \ge \Big( \prod_{j=1}^n x_{mj}^{\alpha_j} \Big) \Big( 1 + \sum_{i=1}^n \alpha_i \ln x_i - \sum_{i=1}^n \alpha_i \ln x_{mi} \Big) \tag{5}$$

is a valid minorization in $x$ around the point $x_m$. Multiplication by the negative
coefficient $c_\alpha$ now gives the desired majorization. The surrogate function separates
parameters and is convex when all of the $\alpha_i$ are positive.
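As a quick numerical illustration of the supporting hyperplane step, the sketch below (with hypothetical helper names and test data) multiplies the logarithmic minorization of a monomial by a negative coefficient and checks that the result dominates the original term, with equality at the anchor point.

```python
import math
import random

def monomial(x, alpha):
    """Evaluate prod_i x_i ** alpha_i for a positive vector x."""
    return math.prod(xi ** ai for xi, ai in zip(x, alpha))

def log_minorizer(x, xm, alpha):
    """Lower bound prod_j xm_j^alpha_j * (1 + sum_i alpha_i (ln x_i - ln xm_i))."""
    slope = sum(ai * (math.log(xi) - math.log(xmi))
                for xi, xmi, ai in zip(x, xm, alpha))
    return monomial(xm, alpha) * (1.0 + slope)

def negative_term_majorizer(c, x, xm, alpha):
    """Upper bound on c * monomial(x, alpha) when the coefficient c is negative."""
    if c >= 0:
        raise ValueError("this bound applies only to negative coefficients")
    return c * log_minorizer(x, xm, alpha)

# Hypothetical test data.
random.seed(1)
c, alpha, xm = -2.0, (1.0, -0.5), (0.8, 1.7)
samples = [[random.uniform(0.1, 5.0) for _ in alpha] for _ in range(1000)]
worst_gap = min(negative_term_majorizer(c, x, xm, alpha) - c * monomial(x, alpha)
                for x in samples)
tangent_gap = negative_term_majorizer(c, xm, xm, alpha) - c * monomial(xm, alpha)
```

Because the bound is a sum of terms of the form constant times $\ln x_i$, it separates parameters just as the text states.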

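In summary, the objective function (1) is majorized up to an irrelevant additive
constant by the sum

$$\begin{aligned} g(x \mid x_m) &= \sum_{i=1}^n g_i(x_i \mid x_m) \\ g_i(x_i \mid x_m) &= \sum_{\alpha \in S_+} c_\alpha \Big( \prod_{j=1}^n x_{mj}^{\alpha_j} \Big) \frac{|\alpha_i|}{\|\alpha\|_1} \Big( \frac{x_i}{x_{mi}} \Big)^{\|\alpha\|_1 \mathrm{sgn}(\alpha_i)} + \sum_{\alpha \in S_-} c_\alpha \Big( \prod_{j=1}^n x_{mj}^{\alpha_j} \Big) \alpha_i \ln x_i, \end{aligned} \tag{6}$$

where $S_+ = \{\alpha : c_\alpha > 0\}$, and $S_- = \{\alpha : c_\alpha < 0\}$. To guarantee that the next
iterate is well defined and occurs on the interior of the parameter domain, it is helpful
to assume for each $i$ that at least one $\alpha \in S_+$ has $\alpha_i$ positive and at least one $\alpha \in S_+$
has $\alpha_i$ negative. Under these conditions each $g_i(x_i \mid x_m)$ is coercive and attains its
minimum on the open interval $(0, \infty)$.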

Minimization of the majorizing function is straightforward because the surrogate
functions $g_i(x_i \mid x_m)$ are univariate functions. The derivative of $g_i(x_i \mid x_m)$ with
respect to its left argument equals

$$g_i'(x_i \mid x_m) = \sum_{\alpha \in S_+} c_\alpha \Big( \prod_{j=1}^n x_{mj}^{\alpha_j} \Big) \alpha_i x_i^{-1} \Big( \frac{x_i}{x_{mi}} \Big)^{\|\alpha\|_1 \mathrm{sgn}(\alpha_i)} + \sum_{\alpha \in S_-} c_\alpha \Big( \prod_{j=1}^n x_{mj}^{\alpha_j} \Big) \alpha_i x_i^{-1}.$$

Assuming that the exponents $\alpha_i$ are integers, this is a rational function of $x_i$, and once
we equate it to 0, we are faced with solving a polynomial equation. This task can be
accomplished by bisection or by Newton's method.
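The one-dimensional root-finding step can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: a signomial is represented as a hypothetical list of `(c, alpha)` pairs, the surrogate $g_i$ is assumed coercive with a single stationary point on $(0, \infty)$, and bisection is carried out on a logarithmic scale. The function names are invented for this example.

```python
import math

def surrogate_derivative(xi, i, xm, terms):
    """Derivative of g_i(x_i | x_m); terms is a list of (c, alpha) pairs."""
    total = 0.0
    for c, alpha in terms:
        a_i = alpha[i]
        if a_i == 0:
            continue  # this term does not involve x_i
        weight = c * math.prod(xmj ** aj for xmj, aj in zip(xm, alpha)) * a_i / xi
        if c > 0:  # positive coefficient: arithmetic-geometric mean part
            norm1 = sum(abs(a) for a in alpha)
            sign = 1.0 if a_i > 0 else -1.0
            weight *= (xi / xm[i]) ** (norm1 * sign)
        total += weight  # negative coefficient: logarithmic part, no extra factor
    return total

def update_coordinate(i, xm, terms, rel_tol=1e-13, max_doublings=200):
    """Bisection for the root of g_i' on (0, inf), starting from the bracket around xm[i]."""
    lo = hi = xm[i]
    for _ in range(max_doublings):
        if surrogate_derivative(lo, i, xm, terms) <= 0:
            break
        lo /= 2.0
    else:
        raise ValueError("no lower bracket found; g_i may not be coercive")
    for _ in range(max_doublings):
        if surrogate_derivative(hi, i, xm, terms) >= 0:
            break
        hi *= 2.0
    else:
        raise ValueError("no upper bracket found; g_i may not be coercive")
    while hi - lo > rel_tol * hi:
        mid = math.sqrt(lo * hi)  # midpoint on the log scale
        if surrogate_derivative(mid, i, xm, terms) < 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Example: 1/x1^3 + 3/(x1 x2^2) + x1 x2 at the anchor point (1, 2).
example_terms = [(1.0, (-3, 0)), (3.0, (-1, -2)), (1.0, (1, 1))]
new_x1 = update_coordinate(0, (1.0, 2.0), example_terms)
new_x2 = update_coordinate(1, (1.0, 2.0), example_terms)
```

Each coordinate is updated independently of the others, which is the practical payoff of parameter separation.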

In a geometric program, the function $g_i'(x_i \mid x_m)$ has a single root on the interval
$(0, \infty)$. For a proof of this fact, note that making the standard change of variables
$x_i = e^{y_i}$ eliminates the positivity constraint $x_i > 0$ and renders the transformed
function $h_i(y_i \mid x_m) = g_i(x_i \mid x_m)$ strictly convex. Because $|\alpha_i| \, \mathrm{sgn}(\alpha_i)^2 = |\alpha_i|$, the
second derivative

$$h_i''(y_i \mid x_m) = \sum_{\alpha \in S_+} c_\alpha \Big( \prod_{j=1}^n x_{mj}^{\alpha_j} \Big) |\alpha_i| \, \|\alpha\|_1 \frac{e^{\|\alpha\|_1 \mathrm{sgn}(\alpha_i) y_i}}{x_{mi}^{\|\alpha\|_1 \mathrm{sgn}(\alpha_i)}}$$

is positive. Hence, $h_i(y_i \mid x_m)$ is strictly convex and possesses a unique minimum
point. These arguments yield the even sweeter dividend that the MM iteration map
$M(x)$ is continuously differentiable. From the vantage point of the implicit function theorem
[7], the stationary condition $h_i'(y_{m+1,i} \mid x_m) = 0$ determines $y_{m+1,i}$, and consequently
$x_{m+1,i}$, in terms of $x_m$. Observe here that $h_i''(y_{m+1,i} \mid x_m) \ne 0$ as required by the implicit
function theorem.

It is also worth pointing out that even more functions can be brought under the
umbrella of signomial programming. For instance, majorization of the functions $-\ln f(x)$
and $\ln f(x)$ is possible for any posynomial $f(x) = \sum_\alpha c_\alpha \prod_{i=1}^n x_i^{\alpha_i}$. In the first case,

$$-\ln f(x) \le -\sum_\alpha \frac{a_{m\alpha}}{b_m} \Big( \sum_{i=1}^n \alpha_i \ln x_i + \ln \frac{c_\alpha b_m}{a_{m\alpha}} \Big) \tag{7}$$

holds for $a_{m\alpha} = c_\alpha \prod_{i=1}^n x_{mi}^{\alpha_i}$ and $b_m = \sum_\alpha a_{m\alpha}$ because Jensen's inequality applies
to the convex function $-\ln t$. In the second case, the supporting hyperplane inequality
applied to the convex function $-\ln t$ implies

$$\ln f(x) \le \ln f(x_m) + \frac{1}{f(x_m)} \big[ f(x) - f(x_m) \big].$$

This puts us back in the position of needing to majorize a posynomial, a problem we
have already discussed in detail. By our previous remarks, the coefficients $c_\alpha$ can be
negative as well as positive in this case. Similar majorizations apply to any composition
$\phi \circ f(x)$ of a posynomial $f(x)$ with an arbitrary concave function $\phi(y)$.



4 Examples of Unconstrained Minimization 

Our first examples demonstrate the robustness of the MM algorithms in minimization
and illustrate some of the complications that occur. In each case we can explicitly
calculate the MM updates. To start, consider the posynomial

$$f_1(x) = \frac{1}{x_1^3} + \frac{3}{x_1 x_2^2} + x_1 x_2$$

with the implied constraints $x_1 > 0$ and $x_2 > 0$. The majorization (4) applied to the
third term of $f_1(x)$ yields

$$x_1 x_2 \le x_{m1} x_{m2} \Big[ \frac{1}{2} \Big( \frac{x_1}{x_{m1}} \Big)^2 + \frac{1}{2} \Big( \frac{x_2}{x_{m2}} \Big)^2 \Big] = \frac{x_{m2}}{2 x_{m1}} x_1^2 + \frac{x_{m1}}{2 x_{m2}} x_2^2.$$

Applied to the second term of $f_1(x)$ using the reciprocal ratios, it gives

$$\frac{3}{x_1 x_2^2} \le \frac{3}{x_{m1} x_{m2}^2} \Big[ \frac{1}{3} \Big( \frac{x_{m1}}{x_1} \Big)^3 + \frac{2}{3} \Big( \frac{x_{m2}}{x_2} \Big)^3 \Big] = \frac{x_{m1}^2}{x_{m2}^2} \frac{1}{x_1^3} + \frac{2 x_{m2}}{x_{m1}} \frac{1}{x_2^3}.$$

The sum $g(x \mid x_m)$ of the two surrogate functions

$$\begin{aligned} g_1(x_1 \mid x_m) &= \frac{1}{x_1^3} + \frac{x_{m1}^2}{x_{m2}^2} \frac{1}{x_1^3} + \frac{x_{m2}}{2 x_{m1}} x_1^2 \\ g_2(x_2 \mid x_m) &= \frac{2 x_{m2}}{x_{m1}} \frac{1}{x_2^3} + \frac{x_{m1}}{2 x_{m2}} x_2^2 \end{aligned}$$

majorizes $f_1(x)$. If we set the derivatives

$$\begin{aligned} g_1'(x_1 \mid x_m) &= -\frac{3}{x_1^4} - \frac{3 x_{m1}^2}{x_{m2}^2} \frac{1}{x_1^4} + \frac{x_{m2}}{x_{m1}} x_1 \\ g_2'(x_2 \mid x_m) &= -\frac{6 x_{m2}}{x_{m1}} \frac{1}{x_2^4} + \frac{x_{m1}}{x_{m2}} x_2 \end{aligned}$$

of each of these equal to 0, then the updates

$$x_{m+1,1} = \sqrt[5]{3 \Big( \frac{x_{m1}^2}{x_{m2}^2} + 1 \Big) \frac{x_{m1}}{x_{m2}}}, \qquad x_{m+1,2} = \sqrt[5]{\frac{6 x_{m2}^2}{x_{m1}^2}}$$

solve the minimization step of the MM algorithm. It is also obvious that the point
$x = (\sqrt[5]{6}, \sqrt[5]{6})^t$ is a fixed point of the updates, and the reader can check that it
minimizes $f_1(x)$.
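The closed-form updates for the first test function are easy to run. The sketch below (function names are illustrative) iterates them from the starting point $(1, 2)$ for a fixed number of steps, which is an arbitrary choice for this example, and records the objective values so the descent property can be inspected.

```python
def f1(x1, x2):
    """First test posynomial: 1/x1^3 + 3/(x1 x2^2) + x1 x2."""
    return 1.0 / x1**3 + 3.0 / (x1 * x2**2) + x1 * x2

def mm_step_f1(x1, x2):
    """One MM update obtained by zeroing the derivatives of the two surrogates."""
    ratio = x1 / x2
    new_x1 = (3.0 * (ratio * ratio + 1.0) * ratio) ** 0.2
    new_x2 = (6.0 / (ratio * ratio)) ** 0.2
    return new_x1, new_x2

x = (1.0, 2.0)
values = [f1(*x)]
for _ in range(200):  # fixed iteration budget, chosen generously for this sketch
    x = mm_step_f1(*x)
    values.append(f1(*x))

fixed_point = 6.0 ** 0.2  # fifth root of 6, approximately 1.4310
```

The iterates approach the fixed point with both components equal to the fifth root of 6, and the recorded objective values never increase.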

It is instructive to consider the slight variations

$$\begin{aligned} f_2(x) &= \frac{1}{x_1 x_2^2} + x_1 x_2^2 \\ f_3(x) &= \frac{1}{x_1 x_2^2} + x_1 x_2 \end{aligned}$$

of this objective function. In the first case, the reader can check that the MM algorithm
iterates according to

$$x_{m+1,1} = \sqrt[3]{\frac{x_{m1}^2}{x_{m2}^2}}, \qquad x_{m+1,2} = \sqrt[3]{\frac{x_{m2}}{x_{m1}}}.$$

In the second case, it iterates according to

$$x_{m+1,1} = \sqrt[5]{\frac{x_{m1}^3}{x_{m2}^3}}, \qquad x_{m+1,2} = \sqrt[5]{\frac{2 x_{m2}^2}{x_{m1}^2}}.$$

The objective function $f_2(x)$ attains its minimum value whenever $x_1 x_2^2 = 1$. The MM
algorithm for $f_2(x)$ converges after a single iteration to the value 2, but the converged
point depends on the initial point $x_0$. The infimum of $f_3(x)$ is 0. This value is attained
asymptotically by the MM algorithm, which satisfies the identities $x_{m1} x_{m2}^{3/2} = 2^{3/10}$
and $x_{m+1,2} = 2^{2/25} x_{m2}$ for all $m \ge 1$. These results imply that $x_{m1}$ tends to 0 and
$x_{m2}$ to $\infty$ in such a manner that $f_3(x_m)$ tends to 0. One could not hope for much
better behavior of the MM algorithm in these two examples.
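Both behaviors can be reproduced in a few lines. The following sketch (names and iteration counts are illustrative choices) applies the two update rules: for the function with a continuum of minima it takes two steps from $(1, 2)$, and for the function whose infimum is not attained it tracks the invariant $x_{m1} x_{m2}^{3/2}$ along the diverging iterates.

```python
def f2(x1, x2):
    return 1.0 / (x1 * x2**2) + x1 * x2**2

def mm_step_f2(x1, x2):
    return (x1**2 / x2**2) ** (1.0 / 3.0), (x2 / x1) ** (1.0 / 3.0)

def f3(x1, x2):
    return 1.0 / (x1 * x2**2) + x1 * x2

def mm_step_f3(x1, x2):
    return (x1**3 / x2**3) ** 0.2, (2.0 * x2**2 / x1**2) ** 0.2

# Continuum of minima: one step lands on the curve x1 * x2^2 = 1 and stays there.
first = mm_step_f2(1.0, 2.0)
second = mm_step_f2(*first)

# Unattained infimum: x1 shrinks, x2 grows, and the objective drifts toward 0.
y = (1.0, 1.0)
trace = []
for _ in range(400):
    y = mm_step_f3(*y)
    trace.append((y, f3(*y)))
invariants = [p[0] * p[1] ** 1.5 for p, _ in trace]
growth = trace[1][0][1] / trace[0][0][1]  # ratio of successive second components
```

Starting the first function from a different point leads to a different limit on the same curve, which is the dependence on the initial point noted in the text.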
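The function

$$f_4(x) = x_1^2 x_2^2 - 2 x_1 x_2 x_3 x_4 + x_3^2 x_4^2 = (x_1 x_2 - x_3 x_4)^2$$

is a signomial but not a posynomial. The surrogate function (6) reduces to

$$\begin{aligned} g(x \mid x_m) = {} & \frac{x_{m2}^2}{2 x_{m1}^2} x_1^4 + \frac{x_{m1}^2}{2 x_{m2}^2} x_2^4 + \frac{x_{m4}^2}{2 x_{m3}^2} x_3^4 + \frac{x_{m3}^2}{2 x_{m4}^2} x_4^4 \\ & - 2 x_{m1} x_{m2} x_{m3} x_{m4} (\ln x_1 + \ln x_2 + \ln x_3 + \ln x_4) \end{aligned}$$

with all variables separated. The MM updates

$$\begin{aligned} x_{m+1,1} &= \sqrt[4]{\frac{x_{m1}^3 x_{m3} x_{m4}}{x_{m2}}}, & x_{m+1,2} &= \sqrt[4]{\frac{x_{m2}^3 x_{m3} x_{m4}}{x_{m1}}} \\ x_{m+1,3} &= \sqrt[4]{\frac{x_{m3}^3 x_{m1} x_{m2}}{x_{m4}}}, & x_{m+1,4} &= \sqrt[4]{\frac{x_{m4}^3 x_{m1} x_{m2}}{x_{m3}}} \end{aligned}$$

converge in a single iteration to a solution of $f_4(x) = 0$. Again the limit depends on
the initial point.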
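The single-step convergence can be seen directly from the updates: after one step both products $x_1 x_2$ and $x_3 x_4$ equal the square root of $x_{m1} x_{m2} x_{m3} x_{m4}$. A minimal sketch with illustrative function names and an arbitrary positive starting point:

```python
def f4(x):
    return (x[0] * x[1] - x[2] * x[3]) ** 2

def mm_step_f4(x):
    x1, x2, x3, x4 = x
    return (
        (x1**3 * x3 * x4 / x2) ** 0.25,
        (x2**3 * x3 * x4 / x1) ** 0.25,
        (x3**3 * x1 * x2 / x4) ** 0.25,
        (x4**3 * x1 * x2 / x3) ** 0.25,
    )

start = (0.1, 0.2, 0.3, 0.4)
after_one = mm_step_f4(start)
after_two = mm_step_f4(after_one)
largest_move = max(abs(a - b) for a, b in zip(after_one, after_two))
```

The second step leaves the point essentially unchanged, so the first step already lands on the solution set.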
The function

$$f_5(x) = x_1 x_2 + x_1 x_3 + x_2 x_3 - \ln(x_1 + x_2 + x_3)$$

is more complicated than a signomial. It also is unbounded below because the point $x$ with
components $x_1 = m$ and $x_2 = x_3 = 1/m$ satisfies $f_5(x) = 2 + m^{-2} - \ln(m + 2/m)$.
According to the majorization (7), an appropriate surrogate is

$$\begin{aligned} g(x \mid x_m) = {} & \Big( \frac{x_{m2}}{2 x_{m1}} + \frac{x_{m3}}{2 x_{m1}} \Big) x_1^2 + \Big( \frac{x_{m1}}{2 x_{m2}} + \frac{x_{m3}}{2 x_{m2}} \Big) x_2^2 + \Big( \frac{x_{m1}}{2 x_{m3}} + \frac{x_{m2}}{2 x_{m3}} \Big) x_3^2 \\ & - \frac{x_{m1} \ln x_1 + x_{m2} \ln x_2 + x_{m3} \ln x_3}{x_{m1} + x_{m2} + x_{m3}} \end{aligned}$$

up to an irrelevant constant. The MM updates are

$$x_{m+1,i} = \sqrt{\frac{x_{mi}^2}{\big( \sum_{j \ne i} x_{mj} \big) (x_{m1} + x_{m2} + x_{m3})}}.$$

If the components of the initial point coincide, then the iterates converge in a single
iteration to the saddle point with all components equal to $1/\sqrt{6}$. Otherwise, it appears
that $f_5(x_m)$ tends to $-\infty$.
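A short experiment illustrates both regimes. The sketch below uses illustrative names; the iteration count of 12 is a deliberate assumption made for this example, kept small because the diverging iterates grow very quickly and would eventually overflow floating point.

```python
import math

def f5(x):
    x1, x2, x3 = x
    return x1 * x2 + x1 * x3 + x2 * x3 - math.log(x1 + x2 + x3)

def mm_step_f5(x):
    """x_i <- x_i / sqrt((sum of the other components) * (sum of all components))."""
    total = sum(x)
    return tuple(xi / math.sqrt((total - xi) * total) for xi in x)

# Equal components: one step reaches the saddle point 1/sqrt(6) in every coordinate.
saddle = mm_step_f5((1.0, 1.0, 1.0))

# Unequal components: the objective keeps decreasing.
x = (1.0, 2.0, 3.0)
values = [f5(x)]
for _ in range(12):
    x = mm_step_f5(x)
    values.append(f5(x))
```

In the second run the largest component grows while the other two shrink, in line with the unboundedness argument given above.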

The following objective functions 

fe(x) — X1X2 + x 1 a;2 — 2x 1 a;2 — X1X2 + ^>.2bx\X2 

—2x\X2 + A.5x\X2 + 3xi + "ix\X2 — 12. 75a;i 
10 9 10 10 

f 7 (x) = j2 x ^+ 2 J2 x ^ J2 x 2 + (w- 5 -o.5)j2^ 

10 1 
-(2xicr 5 )5> 1 + - 

i=7 
r / x 2 — 1 — 1 . 2-1-2-1 

f$(x) = xix 3 x 6 x 7 + XiX 3 x 5 x 6 x 7 

.32-22. -1-12. 3-3 
+xix 2 x 5 x 6 + x 2 x A Xq + x 3 x 5 x e 

fg(x) — X\x A + x 2 x 3 + XiX 2 x 3 X4 + x± a;~ 





Fun   Type   Initial Point x0    Min Point                        Min Value   Iters (10^-9)
f1    P      (1,2)               (1.4310,1.4310)                  3.4128      38
f2    P      (1,2)               (0.6300,1.2599)                  2.0000      2
f3    P      (1,1)               diverges                         0.0000
f4    S      (0.1,0.2,0.3,0.4)   (0.1596,0.3191,0.1954,0.2606)    0.0000      3
f5    G      (1,1,1)             (0.4082,0.4082,0.4082)           0.2973      2
             (1,2,3)             diverges                         -∞
f6    S      (1,1)               (2.9978,0.4994)                  -14.2031    558
f7    S      (1,...,10)          0.0255x0                         0.0000      18
f8    P      (1,...,7)           diverges                         0.0000
f9    P      (1,2,3,4)           (0.3969,0.0000,0.0000,1.5874)    2.0000      7

Table 1 Numerical examples of unconstrained signomial programming. Test functions $f_4(x)$,
$f_6(x)$, $f_7(x)$, $f_8(x)$ and $f_9(x)$ are taken from [19]. P: posynomial; S: signomial; G: general
function.



from the reference [19] are intended for numerical illustration. Table 1 lists initial
conditions, minimum points, minimum values, and number of iterations until convergence
under the MM algorithm. Convergence is declared when the relative change in the
objective function is less than a pre-specified value $\epsilon$, in other words, when

$$\frac{f(x_m) - f(x_{m+1})}{|f(x_m)| + 1} < \epsilon.$$

Optimization of the univariate surrogate functions easily succumbs to Newton's method.
The MM algorithm takes fewer iterations to converge than the path algorithm for all
of the test functions mentioned in [19] except $f_6(x)$. Furthermore, the MM algorithm
avoids calculation of the gradient and Hessian and requires no matrix decompositions
or selection of tuning constants.
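The stopping rule can be wrapped in a small generic driver. This is a sketch under assumptions: the function name `mm_minimize`, its signature, and the convention of counting every call to the update (including the final one that detects convergence) are choices made here, not details taken from the paper. It is exercised on the two-term posynomial $1/(x_1 x_2^2) + x_1 x_2^2$ with its closed-form update.

```python
def mm_minimize(f, step, x0, eps=1e-9, max_iter=10_000):
    """Iterate x <- step(x) until the relative decrease in f falls below eps."""
    x, fx = x0, f(x0)
    for iteration in range(1, max_iter + 1):
        x_new = step(x)
        f_new = f(x_new)
        small_change = (fx - f_new) / (abs(fx) + 1.0) < eps
        x, fx = x_new, f_new
        if small_change:
            return x, fx, iteration
    return x, fx, max_iter

def f2(x):
    return 1.0 / (x[0] * x[1] ** 2) + x[0] * x[1] ** 2

def step_f2(x):
    return ((x[0] / x[1]) ** 2) ** (1.0 / 3.0), (x[1] / x[0]) ** (1.0 / 3.0)

point, value, iterations = mm_minimize(f2, step_f2, (1.0, 2.0))
```

Under this counting convention the run stops after two updates: the first reaches the minimum value and the second confirms that the objective no longer changes.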

As Section 7 observes, MM algorithms typically converge at a linear rate. Although
slow convergence can occur for functions such as the test function $f_6(x)$, there are
several ways to accelerate an MM algorithm. For example, our published quasi-Newton
acceleration [20] often reduces the necessary number of iterations by one or two orders
of magnitude. Figure 1 shows the progress of the MM iterates for the test function $f_6(x)$
with and without quasi-Newton acceleration. Under a convergence criterion of $\epsilon = 10^{-9}$
and $q = 1$ secant condition, the required number of iterations falls to 30; under the same
convergence criterion and $q = 2$ secant conditions, the required number of iterations
falls to 12. It is also worth emphasizing that separation of parameters enables parallel
processing in high-dimensional problems. We have recently argued [21] that the best
approach to parallel processing is through graphics processing units (GPUs). These
cheap hardware devices offer one to two orders of magnitude acceleration in many MM
algorithms with parameters separated.



5 Constrained Signomial Programming 

Extending the MM algorithm to constrained geometric and signomial programming is
challenging. Box constraints $a_i \le x_i \le b_i$ are consistent with parameter separation as
just developed, but more complicated posynomial constraints that couple parameters
are not. Posynomial inequality constraints take the form

$$h(x) = \sum_\beta d_\beta \prod_{i=1}^n x_i^{\beta_i} \le 1.$$



Fig. 1 Upper left: The test function $f_6(x)$. Upper right: 558 MM iterates. Lower left: 30
accelerated MM iterates (q = 1 secant conditions). Lower right: 12 accelerated MM iterates
(q = 2 secant conditions).



The corresponding equality constraint sets $h(x) = 1$. We propose handling both
constraints by penalty methods. Before we treat these matters in more depth, let us relax
the positivity restrictions on the $d_\beta$ but enforce the restriction $\beta_i \ge 0$. The latter
objective can be achieved by multiplying $h(x)$ by $x_i^{\max_\beta \{-\beta_i, 0\}}$ for all $i$. If we subtract
the two sides of the resulting equality, then the equality constraint $h(x) = 1$ can be
rephrased as $r(x) = \sum_\gamma e_\gamma \prod_{i=1}^n x_i^{\gamma_i} = 0$, with no restriction on the signs of the $e_\gamma$
but with the requirement $\gamma_i \ge 0$ in effect. For example, the equality constraint

$$\frac{1}{x_1} + \frac{x_1}{x_2^2} = 1$$

becomes

$$x_1^2 + x_2^2 - x_1 x_2^2 = 0.$$

In the quadratic penalty method [12,16] with objective function $f(x)$ and a single
equality constraint $r(x) = 0$ and a single inequality constraint $s(x) \le 0$, one minimizes
the sum $f_\lambda(x) = f(x) + \lambda r(x)^2 + \lambda s(x)_+^2$, where $s(x)_+ = \max\{s(x), 0\}$. As the penalty
constant $\lambda$ tends to $\infty$, the solution vector $x_\lambda$ typically converges to the constrained
minimum. In the revised objective function, the term $r(x)^2$ is a signomial whenever
$r(x)$ is a signomial. For example, in our toy problem the choice $r(x) = x_1^2 + x_2^2 - x_1 x_2^2$
has square

$$r(x)^2 = x_1^4 + x_2^4 + x_1^2 x_2^4 + 2 x_1^2 x_2^2 - 2 x_1^3 x_2^2 - 2 x_1 x_2^4.$$

Of course, the powers in $r(x)$ can be fractional here as well as integer. The term $s(x)_+^2$
is not a signomial and must be subjected to the majorization

$$s(x)_+^2 \le \begin{cases} [s(x) - s(x_m)]^2 & s(x_m) < 0 \\ s(x)^2 & s(x_m) \ge 0 \end{cases}$$

to achieve this status. In practice, one does not need to fully minimize $f_\lambda(x)$ for any
fixed $\lambda$. If one increases $\lambda$ slowly enough, then it usually suffices to merely decrease
$f_\lambda(x)$ at each iteration. The MM algorithm is designed to achieve precisely this goal.
Our exposition so far suggests that we majorize $r(x)^2$, $s(x)^2$, and $[s(x) - s(x_m)]^2$ in
exactly the same manner that we majorize $f(x)$. Separation of parameters generalizes,
and the resulting MM algorithm keeps all parameters positive while permitting
pertinent parameters to converge to 0. Section 7 summarizes some of the convergence
properties of this hybrid procedure.
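The piecewise majorization of the squared positive part is simple enough to verify on a grid. In the sketch below the function names are illustrative, and the scalar arguments `s` and `s_m` stand for the values $s(x)$ and $s(x_m)$ of the constraint function.

```python
def plus_squared(s):
    """The penalty term max(s, 0)^2."""
    return max(s, 0.0) ** 2

def plus_squared_majorizer(s, s_m):
    """(s - s_m)^2 when the anchor value s_m is negative, s^2 otherwise."""
    return (s - s_m) ** 2 if s_m < 0 else s ** 2

grid = [k / 10.0 for k in range(-50, 51)]
worst_gap = min(plus_squared_majorizer(s, s_m) - plus_squared(s)
                for s in grid for s_m in grid)
tangent_gaps = [plus_squared_majorizer(s_m, s_m) - plus_squared(s_m) for s_m in grid]
```

Both branches are quadratics in $s(x)$, so once the constraint function is a signomial the majorizer is one as well.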

The quadratic penalty method traditionally relies on Newton's method to minimize
the unconstrained functions $f_\lambda(x)$. Unfortunately, this tactic suffers from roundoff
errors and numerical instability. Some of these problems disappear with the MM
algorithm. No matrix inversions are involved, and iterates enjoy the descent property.
Ill-conditioning does cause harm in the form of slow convergence, but the previously
mentioned quasi-Newton acceleration largely remedies the situation [20]. As an
alternative to quadratic penalties, exact penalties take the form $\lambda |r(x)| + \lambda s(x)_+$. Remarkably,
the exact penalty method produces the constrained minimum, not just in the limit,
but for all finite $\lambda$ beyond a certain point. Although this desirable property avoids the
numerical instability encountered in the quadratic penalty method, the kinks in the
objective functions $f(x) + \lambda |r(x)| + \lambda s(x)_+$ are a nuisance. We will demonstrate in a
future paper how to harness the MM algorithm to exact penalization.



6 Nonnegative Quadratic Programming 

As an illustration of constrained signomial programming, consider quadratic
programming over the positive orthant. Let

$$f(x) = \frac{1}{2} x^t Q x + c^t x$$

be the objective function, $Ex = d$ the linear equality constraints, and $Ax \le b$ the linear
inequality constraints. The symmetric matrix $Q$ can be negative definite, indefinite, or
positive definite. The quadratic penalty method involves minimizing the sequence of
penalized objective functions

$$f_\lambda(x) = \frac{1}{2} x^t Q x + c^t x + \frac{\lambda}{2} \|(Ax - b)_+\|_2^2 + \frac{\lambda}{2} \|Ex - d\|_2^2$$

as $\lambda$ tends to $\infty$. Based on the obvious majorization

$$x_+^2 \le \begin{cases} (x - x_m)^2 & x_m < 0 \\ x^2 & x_m \ge 0, \end{cases}$$

the term $\|(Ax - b)_+\|_2^2$ is majorized by $\|Ax - b - r_m\|_2^2$, where

$$r_m = \min\{Ax_m - b, 0\}.$$

A brief calculation shows that $f_\lambda(x)$ is majorized by the surrogate function

$$g_\lambda(x \mid x_m) = \frac{1}{2} x^t H_\lambda x + v_{\lambda m}^t x$$

up to an irrelevant constant, where $H_\lambda$ and $v_{\lambda m}$ are defined by

$$\begin{aligned} H_\lambda &= Q + \lambda (A^t A + E^t E) \\ v_{\lambda m} &= c - \lambda A^t (b + r_m) - \lambda E^t d. \end{aligned}$$

It is convenient to assume that the diagonal coefficients $\frac{1}{2} h_{\lambda ii}$ appearing in the quadratic
form $\frac{1}{2} x^t H_\lambda x$ are positive. This is generally the case for large $\lambda$. One can handle the
off-diagonal term $h_{\lambda ij} x_i x_j$ by either the majorization (4) or the majorization (5)
according to the sign of $h_{\lambda ij}$. The reader can check that the MM updates reduce to

$$x_{m+1,i} = x_{mi} \left[ -\frac{v_{\lambda m i}}{2 h_{\lambda m i}^+} + \sqrt{\frac{v_{\lambda m i}^2}{4 (h_{\lambda m i}^+)^2} - \frac{h_{\lambda m i}^-}{h_{\lambda m i}^+}} \, \right], \tag{8}$$

where

$$h_{\lambda m i}^+ = \sum_{j : h_{\lambda ij} > 0} h_{\lambda ij} x_{mj}, \qquad h_{\lambda m i}^- = \sum_{j : h_{\lambda ij} < 0} h_{\lambda ij} x_{mj}.$$

When $h_{\lambda m i}^- = 0$, the update (8) collapses to

$$x_{m+1,i} = x_{mi} \max\Big\{ -\frac{v_{\lambda m i}}{h_{\lambda m i}^+}, 0 \Big\}. \tag{9}$$

To avoid sticky boundaries, we replace 0 in equation (9) by a small positive constant $\epsilon$
such as 10 . Sha et al. [17] derived the update (8) for $\lambda = 0$ ignoring the constraints
$Ex = d$ and $Ax \le b$.
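The multiplicative update is short enough to implement with plain lists. The sketch below is illustrative rather than the authors' implementation: the function name, the fixed inner iteration count, the doubling schedule for the penalty constant with warm starts, and the floor value `eps` are all assumptions. The test data are also hypothetical: the objective is $\frac{1}{2} x_1^2 + x_2^2 - x_1 x_2 - 2 x_1 - 6 x_2$, and the inequality constraints `A`, `b` were chosen for this example so that the constrained minimum is $(2/3, 4/3)$. Equality constraints are omitted.

```python
import math

def mm_penalized_qp(Q, c, A, b, lam, x, iters=200, eps=1e-9):
    """Run MM updates for the penalized QP with inequality constraints A x <= b only."""
    n = len(c)
    # H = Q + lam * A^t A (assumed to have positive diagonal entries).
    H = [[Q[i][j] + lam * sum(row[i] * row[j] for row in A) for j in range(n)]
         for i in range(n)]
    for _ in range(iters):
        slack = [sum(a * xj for a, xj in zip(row, x)) - bk for row, bk in zip(A, b)]
        r = [min(s, 0.0) for s in slack]  # r_m = min{A x_m - b, 0}
        v = [c[i] - lam * sum(row[i] * (bk + rk) for row, bk, rk in zip(A, b, r))
             for i in range(n)]
        updated = []
        for i in range(n):
            h_plus = sum(H[i][j] * x[j] for j in range(n) if H[i][j] > 0)
            h_minus = sum(H[i][j] * x[j] for j in range(n) if H[i][j] < 0)
            half = v[i] / (2.0 * h_plus)
            factor = -half + math.sqrt(half * half - h_minus / h_plus)
            updated.append(x[i] * max(factor, eps))  # floor avoids sticking at 0
        x = updated
    return x

# Hypothetical test problem (see the lead-in for which parts are assumptions).
Q = [[1.0, -1.0], [-1.0, 2.0]]
c = [-2.0, -6.0]
A = [[1.0, 1.0], [-1.0, 2.0], [2.0, 1.0]]
b = [2.0, 2.0, 3.0]
x = [1.0, 1.0]
for k in range(13):  # penalty constants 1, 2, 4, ..., 4096 with warm starts
    x = mm_penalized_qp(Q, c, A, b, 2.0 ** k, x)
```

Warm starting each penalty level from the previous solution matters here: a cold start at a large penalty constant tends to creep along a constraint boundary, which is the slow-convergence effect of ill-conditioning mentioned earlier.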

For a numerical example without equality constraints take 

$$f_{10}(x) = \frac{1}{2} x_1^2 + x_2^2 - x_1 x_2 - 2 x_1 - 6 x_2$$



b = 



The minimum occurs at the point $(2/3, 4/3)^t$. Table 2 lists the number of iterations
until convergence and the converged point $x_\lambda$ for the sequence of penalty constants
$\lambda = 2^k$. The quadratic program

/ll(as) = -8a;i - 1622 4 
A=(}D, b 




2 
•f'l 



■4xj 



1 1 
1 



converges much more slowly. Its minimum occurs at the point $(2.4, 1.6)^t$. Table 3 lists
the numbers of iterations until convergence with ($q = 1$) and without ($q = 0$)
acceleration and the converged point $x_\lambda$ for the same sequence of penalty constants $\lambda = 2^k$.
Fortunately, quasi-Newton acceleration compensates for ill conditioning in this test
problem.






log2 λ   Iters   x_λ
0        8       (0.9503,1.6464)
1        6       (0.8580,1.5164)
2        5       (0.8138,1.4461)
3        23      (0.7853,1.4067)
4        32      (0.7264,1.3702)
5        31      (0.6967,1.3518)
6        30      (0.6817,1.3426)
7        29      (0.6742,1.3380)
8        28      (0.6704,1.3356)
9        26      (0.6686,1.3345)
10       25      (0.6676,1.3339)
11       23      (0.6671,1.3336)
12       22      (0.6669,1.3335)
13       21      (0.6668,1.3334)
14       19      (0.6667,1.3334)
15       18      (0.6667,1.3334)
16       16      (0.6667,1.3333)
17       15      (0.6667,1.3333)



Table 2 Iterates from the quadratic penalty method for the test function $f_{10}(x)$. The
convergence criterion for the inner loops is $10^{-9}$.



■ A Iters (q = 0) Iters (q = 1) 



x\ 






18 


1 


2 


2 


56 


3 


97 


4 


167 


5 


312 


6 


541 


7 


955 


8 


1674 


9 


2924 


10 


4839 


11 


7959 


12 


12220 


13 


17674 


14 


21739 


15 


20736 


16 


8073 


17 


111 


18 


6 


19 


5 


20 


3 


21 


2 



(3.0000,1 
(2.8571,1 
(2.6667,1 
(2.5455,1 
(2.4762,1 
(2.4390,1 
(2.4198,1 
(2.4099,1 
(2.4050,1 
(2.4025,1 
(2.4013,1 
(2.4006,1 
(2.4003,1 
(2.4002,1 
(2.4001,1 
(2.4000,1 
(2.4000,1 
(2.4000,1 
(2.4000,1 
(2.4000,1 
(2.4000,1 
(2.4000,1 



.8000) 
.7143) 
.6667) 
.6364) 
.6190) 
.6098) 
.6049) 
.6025) 
.6012) 
.6006) 
.6003) 
.6002) 
.6001) 
.6000) 
.6000) 
.6000) 
.6000) 
.6000) 
.6000) 
.6000) 
.6000) 
.6000) 



Table 3 Iterates from the quadratic penalty method for the test function $f_{11}(x)$. The
convergence criterion for the inner loops is $10^{-16}$.



7 Convergence 



As we have seen, the behavior of the MM algorithm is intimately tied to the behavior
of the objective function $f(x)$. For the sake of simplicity, we now restrict attention
to unconstrained minimization of posynomials and investigate conditions guaranteeing
that $f(x)$ possesses a unique minimum on its domain. Uniqueness is related to the
strict convexity of the reparameterization

$$h(y) = \sum_{\alpha \in S} c_\alpha e^{\alpha^t y}$$

of $f(x)$, where $\alpha^t y = \sum_{i=1}^n \alpha_i y_i$ is the inner product of $\alpha$ and $y$ and $x_i = e^{y_i}$ for each
$i$. The Hessian matrix

$$d^2 h(y) = \sum_{\alpha \in S} c_\alpha e^{\alpha^t y} \alpha \alpha^t$$

of $h(y)$ is positive semidefinite, so $h(y)$ is convex. If we let $T$ be the subspace of $\mathbb{R}^n$
spanned by $\{\alpha\}_{\alpha \in S}$, then $h(y)$ is strictly convex if and only if $T = \mathbb{R}^n$. Indeed, suppose
the condition holds. For any $v \ne 0$, we then must have $\alpha^t v \ne 0$ for some $\alpha \in S$. It
follows that

$$v^t d^2 h(y) v = \sum_{\alpha \in S} c_\alpha e^{\alpha^t y} (\alpha^t v)^2 > 0,$$

and $d^2 h(y)$ is positive definite. Conversely, suppose $T \ne \mathbb{R}^n$, and take $v \ne 0$ with
$\alpha^t v = 0$ for every $\alpha \in S$. Then $h(y + tv) = h(y)$ for every scalar $t$, which is incompatible
with $h(y)$ being strictly convex.

Strict convexity guarantees uniqueness, not existence, of a minimum point.
Coerciveness ensures existence. The objective function $f(x)$ is coercive if $f(x)$ tends to $\infty$
whenever any component of $x$ tends to 0 or $\infty$. Under the reparameterization $x_i = e^{y_i}$,
this is equivalent to $h(y) = f(x)$ tending to $\infty$ as $\|y\|_2$ tends to $\infty$. A necessary and
sufficient condition for this to occur is that $\max_{\alpha \in S} \alpha^t v > 0$ for every $v \ne 0$. For a
proof, suppose the contrary condition holds for some $v \ne 0$. Then it is clear that $h(tv)$
remains bounded above by $h(0)$ as the scalar $t$ tends to $\infty$. Conversely, if the stated
condition is true, then the function $q(y) = \max_{\alpha \in S} \alpha^t y$ is continuous and achieves its
minimum of $d > 0$ on the sphere $\{y \in \mathbb{R}^n : \|y\|_2 = 1\}$. It follows that $q(y) \ge d \|y\|_2$
and that

$$h(y) \ge \max_{\alpha \in S} \{ c_\alpha e^{\alpha^t y} \} \ge \Big( \min_{\alpha \in S} c_\alpha \Big) e^{d \|y\|_2}.$$

This lower bound shows that $h(y)$ is coercive.

The coerciveness condition is hard to apply in practice. An equivalent condition
is that the origin 0 belongs to the interior of the convex hull of the set $\{\alpha\}_{\alpha \in S}$.
It is straightforward to show that the negations of these two conditions are logically
equivalent. Thus, suppose $q(v) = \max_{\alpha \in S} \alpha^t v \le 0$ for some $v \ne 0$. Every convex
combination $\sum_\alpha p_\alpha \alpha$ then satisfies $(\sum_\alpha p_\alpha \alpha)^t v \le 0$. If the origin is in the interior
of the convex hull, then $\epsilon v$ is also for every sufficiently small $\epsilon > 0$. But this leads
to the contradiction $\epsilon v^t v = \epsilon \|v\|_2^2 \le 0$. Conversely, suppose 0 is not in the interior
of the convex hull. According to the separating hyperplane theorem for convex sets,
there exists a unit vector $v$ with $v^t \alpha \le 0 = v^t 0$ for every $\alpha \in S$. In other words,
$q(v) \le 0$. The convex hull criterion is easier to check, but it is not constructive. In simple
cases such as the objective function $f_1(x)$ where the power vectors are $\alpha = (-3, 0)^t$,
$\alpha = (-1, -2)^t$, and $\alpha = (1, 1)^t$, it is visually obvious that the origin is in the interior
of their convex hull.
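For two-parameter problems the convex hull criterion can be checked mechanically. The sketch below is one possible approach and is not taken from the paper: it uses the fact that, in the plane, the origin lies in the interior of the convex hull of a set of nonzero vectors exactly when no closed half-plane through the origin contains all of them, that is, when every angular gap between consecutive directions is strictly smaller than $\pi$. The function name and tolerance are illustrative.

```python
import math

def origin_in_interior_of_hull_2d(exponents, tol=1e-9):
    """Planar test: all cyclic gaps between sorted direction angles are below pi."""
    angles = sorted(math.atan2(a2, a1) for a1, a2 in exponents if (a1, a2) != (0, 0))
    if len(angles) < 3:
        return False  # one or two directions always fit in a closed half-plane
    gaps = [later - earlier for earlier, later in zip(angles, angles[1:])]
    gaps.append(angles[0] + 2.0 * math.pi - angles[-1])  # wrap-around gap
    return max(gaps) < math.pi - tol

coercive_f1 = origin_in_interior_of_hull_2d([(-3, 0), (-1, -2), (1, 1)])
coercive_f2 = origin_in_interior_of_hull_2d([(-1, -2), (1, 2)])
coercive_f3 = origin_in_interior_of_hull_2d([(-1, -2), (1, 1)])
```

The three calls use the power vectors of the two-variable test functions: the first passes, while the function with a continuum of minima and the function whose infimum is not attained both fail, as expected from their behavior in Section 4. In higher dimensions a linear programming formulation would be needed instead.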

One can also check the criterion $q(v) > 0$ for all $v \ne 0$ by solving a related geometric
programming problem. This problem consists in minimizing the scalar $t$ subject to the
inequality constraints $\alpha^t y \le t$ for all $\alpha \in S$ and the nonlinear equality constraint
$\|y\|_2^2 = 1$. If $t_{\min} \le 0$, then the original criterion fails.

In some cases, the objective function $f(x)$ does not attain its minimum on the
open domain $\mathbb{R}^n_{>0} = \{x : x_i > 0, 1 \le i \le n\}$. This condition is equivalent to the
corresponding function $\ln h(y)$ being unbounded below on $\mathbb{R}^n$. According to Gordon's
theorem [1,9], this can happen if and only if 0 is not in the convex hull of the set
$\{\alpha\}_{\alpha \in S}$. Alternatively, both conditions are equivalent to the existence of a vector $v$
with $\alpha^t v < 0$ for all $\alpha \in S$. For the objective function $f_3(x)$, the power vectors are
$\alpha = (-1, -2)^t$ and $\alpha = (1, 1)^t$. The origin $(0, 0)^t$ does not lie on the line segment
between them, and the vector $(-3/2, 1)^t$ forms a strictly obtuse angle with each. As
predicted, $f_3(x)$ does not attain its infimum on $\mathbb{R}^2_{>0}$.

The theoretical development in reference [9] demonstrates that the MM algorithm
converges at a linear rate to the unique minimum point of the objective function $f(x)$
when $f(x)$ is coercive and its convex reparameterization $h(y)$ is strictly convex. The
theory does not cover other cases, and it would be interesting to investigate them.
The general convergence theory of MM algorithms [9] states that five properties of the
objective function $f(x)$ and MM algorithmic map $x \mapsto M(x)$ guarantee convergence
to a stationary point of $f(x)$: (a) $f(x)$ is coercive on its open domain; (b) $f(x)$ has
only isolated stationary points; (c) $M(x)$ is continuous; (d) $x^*$ is a fixed point of $M(x)$
if and only if $x^*$ is a stationary point of $f(x)$; and (e) $f[M(x^*)] \le f(x^*)$, with equality
if and only if $x^*$ is a fixed point of $M(x)$. For a general signomial program, items (a)
and (b) are the hardest to check. Our examples provide some clues.

The standard convergence results for the quadratic penalty method are covered in
the references [9,12,16]. To summarize the principal finding, suppose that the objective
function $f(x)$ and the constraint functions $r_i(x)$ and $s_j(x)$ are continuous and that
$f(x)$ is coercive on $\mathbb{R}^n_{>0}$. If $x_\lambda$ minimizes the penalized objective function

$$f_\lambda(x) = f(x) + \lambda \sum_i r_i(x)^2 + \lambda \sum_j s_j(x)_+^2,$$

and $x_\infty$ is a cluster point of $x_\lambda$ as $\lambda$ tends to $\infty$, then $x_\infty$ minimizes $f(x)$ subject
to the constraints. In this regard observe that the coerciveness assumption on $f(x)$
implies that the solution set $\{x_\lambda\}_\lambda$ is bounded and possesses at least one cluster point.
Of course, if the solution set consists of a single point, then $x_\lambda$ tends to that point.



8 Discussion 

The current paper presents novel algorithms for both geometric and signomial program- 
ming. Although our examples are low dimensional, the previous experience of Sha et al. 
[17] offers convincing evidence that the MM algorithm works well for high-dimensional
quadratic programming with nonnegativity constraints. The ideas pursued here - the 
MM principle, separation of variables, quasi-Newton acceleration, and penalized op- 
timization - are surprisingly potent in large-scale optimization. The MM algorithm 
deals with the objective function directly and reduces multivariate minimization to a 
sequence of one-dimensional minimizations. The MM updates are simple to code and 
enjoy the crucial descent property. Treating constrained signomial programming by 
the penalty method extends the MM algorithm even further. Quadratic programming 
with linear equality and inequality constraints is the most important special case of 
constrained signomial programming. Our new MM algorithm for constrained quadratic
programming deserves consideration in high-dimensional problems. Even though MM 
algorithms can be notoriously slow to converge, quasi-Newton acceleration can dramat- 
ically improve matters. Acceleration involves no matrix inversion, only matrix times 
vector multiplication. Finally, it is worth keeping in mind that parameter separated 
algorithms are ideal candidates for parallel processing. 

Because geometric programs are ultimately convex, it is relatively easy to pose and 
check sufficient conditions for global convergence of the MM algorithm. In contrast 
it is far more difficult to analyze the behavior of the MM algorithm for signomial 
programs. Theoretical progress will probably be piecemeal and require problem-specific 
information. A major difficulty is understanding the asymptotic nature of the objective 
function as parameters approach 0 or $\infty$. Even in the absence of theoretical guarantees,
the descent property of the MM algorithm makes it an attractive solution technique 
and a diagnostic tool for finding counterexamples. Some of our test problems expose 
the behavior of the MM algorithm in non-standard situations. We welcome the help 
of the optimization community in unraveling the mysteries of the MM algorithm in 
signomial programming. 